Add per-sample search parameter support via samplesheet #439

Merged
jonasscheid merged 37 commits into nf-core:dev from jonasscheid:search-params-samplesheet
Feb 27, 2026

@jonasscheid (Collaborator) commented Feb 6, 2026

Summary

This PR adds support for per-sample search parameters in the samplesheet, enabling users to combine samples with different MHC classes, instruments, or search strategies in a single pipeline run. It is a crucial step towards SDRF support (#277).

Key Features

  • SearchPreset column: Choose from predefined presets (lumos/qe/timstof/astral/xl × class1/class2)
  • TSV-based preset definitions: Presets defined in assets/search_presets.tsv with JSON schema validation
  • Full parameter priority chain: CLI params > preset > nextflow.config defaults
  • Preset-aware global FDR: Groups by search preset for proper FDR control
  • Search params propagated end-to-end: From samplesheet through CometAdapter, MS2Rescore, Percolator, IDFilter

Parameter Priority

Search parameters are resolved with the following priority (highest to lowest):

  1. CLI parameters (--fragment_mass_tolerance 0.05) — override everything
  2. Search preset (SearchPreset column → assets/search_presets.tsv) — predefined parameter sets
  3. Config defaults (nextflow.config) — global fallback

CLI override detection uses workflow.commandLine regex matching to determine if a parameter was explicitly set.
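
The priority chain above can be illustrated with a minimal Python sketch. This is not the pipeline's `resolveSearchParams()` (which lives in Nextflow/Groovy); the function names, the regex, and every parameter name other than `fragment_mass_tolerance` are assumptions made for illustration.

```python
import re


def was_set_on_cli(command_line: str, param: str) -> bool:
    """Return True if --param appears as an explicit flag in the command line."""
    # Require whitespace, "=", or end of string after the flag name so that
    # "--fragment_mass_tolerance" does not match a longer flag with the same prefix.
    pattern = r"(?:^|\s)--" + re.escape(param) + r"(?=\s|=|$)"
    return re.search(pattern, command_line) is not None


def resolve_search_params(command_line, cli_params, preset, defaults):
    """Resolve each parameter with priority: CLI > preset > config default."""
    resolved = {}
    for name, default in defaults.items():
        if was_set_on_cli(command_line, name):
            resolved[name] = cli_params[name]   # explicit CLI value wins
        elif name in preset:
            resolved[name] = preset[name]       # otherwise use the preset
        else:
            resolved[name] = default            # global fallback
    return resolved


# Hypothetical inputs: only fragment_mass_tolerance was typed on the command line.
command_line = "nextflow run nf-core/mhcquant --input samplesheet.tsv --fragment_mass_tolerance 0.05"
cli_params = {"fragment_mass_tolerance": 0.05, "peptide_max_length": 12}
preset = {"fragment_mass_tolerance": 0.01, "peptide_max_length": 14}
defaults = {"fragment_mass_tolerance": 0.01, "peptide_max_length": 12, "number_mods": 3}

resolved = resolve_search_params(command_line, cli_params, preset, defaults)
print(resolved)
```

Checking the command line, rather than comparing values against defaults, matters because a parameter object usually already contains the config defaults; a value equal to the default could otherwise not be distinguished from an explicit override.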

Search Preset Parameters

Each preset defines the following parameters:

| Parameter | Description | Example |
|---|---|---|
| PeptideMinLength / PeptideMaxLength | Peptide length filters | 8 / 14 |
| PrecursorMassRange | Digest mass range | 800:2500 |
| PrecursorCharge | Precursor charge states | 2:3 |
| PrecursorMassTolerance | Precursor mass tolerance | 5 |
| PrecursorErrorUnit | Tolerance unit (ppm/Da) | ppm |
| FragmentMassTolerance | Fragment mass tolerance | 0.01 |
| FragmentBinOffset | Fragment binning offset | 0.0 |
| MS2PIPModel | MS2Rescore model | Immuno-HCD |
| ActivationMethod | Fragmentation method | HCD |
| Instrument | Instrument resolution | high_res |
| NumberMods | Max variable modifications | 3 |
| FixedMods / VariableMods | Modification lists | Oxidation (M) |
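
Range-valued parameters such as `PrecursorMassRange` (`800:2500`) and `PrecursorCharge` (`2:3`) are written as `MIN:MAX` strings in the examples. A small Python sketch of how such a value could be split and sanity-checked follows; the helper name `parse_range` and the integer-only assumption are illustrative, not taken from the pipeline.

```python
def parse_range(text: str) -> tuple[int, int]:
    """Split a 'MIN:MAX' string into an ordered pair of integers."""
    low_text, sep, high_text = text.partition(":")
    if not sep:
        raise ValueError(f"expected 'MIN:MAX', got {text!r}")
    low, high = int(low_text), int(high_text)
    if low > high:
        raise ValueError(f"range minimum exceeds maximum in {text!r}")
    return low, high


mass_range = parse_range("800:2500")   # example value from the parameter list
charge_range = parse_range("2:3")      # example value from the parameter list
print(mass_range, charge_range)
```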

Available Presets

| Preset | Instrument | Class | Key Settings |
|---|---|---|---|
| lumos_class1 / qe_class1 / astral_class1 | Orbitrap | I | 8-14 AA, HCD, high_res |
| lumos_class2 / qe_class2 / astral_class2 | Orbitrap | II | 8-30 AA, HCD, high_res |
| timstof_class1 | timsTOF | I | 8-14 AA, CID, 20 ppm |
| timstof_class2 | timsTOF | II | 8-30 AA, CID, 20 ppm |
| xl_class1 / xl_class2 | LTQ/XL | I/II | CID, low_res, 0.5 Da fragment tolerance |

Users can also provide a custom presets TSV via --search_presets.
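
As a rough illustration of what a custom presets file involves, the Python sketch below reads a tab-separated table into a dictionary keyed by preset name and rejects files with missing columns or duplicate names. The `PresetName` column, the reduced column set, and the two `custom_*` rows are invented for this example; the real layout is defined by `assets/search_presets.tsv` and its JSON schema.

```python
import csv
import io

# Hypothetical minimal column set; the actual file defines more parameters.
REQUIRED_COLUMNS = {"PresetName", "PeptideMinLength", "PeptideMaxLength", "FragmentMassTolerance"}


def load_presets(handle):
    """Read a presets TSV into {preset_name: {parameter: value}}."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"presets TSV is missing columns: {sorted(missing)}")
    presets = {}
    for row in reader:
        name = row["PresetName"]
        if name in presets:
            raise ValueError(f"duplicate preset name: {name}")
        # Convert text cells to typed values so downstream code can compare them.
        presets[name] = {
            "PeptideMinLength": int(row["PeptideMinLength"]),
            "PeptideMaxLength": int(row["PeptideMaxLength"]),
            "FragmentMassTolerance": float(row["FragmentMassTolerance"]),
        }
    return presets


# Invented example content standing in for a user-supplied file.
tsv_text = (
    "PresetName\tPeptideMinLength\tPeptideMaxLength\tFragmentMassTolerance\n"
    "custom_a\t8\t14\t0.01\n"
    "custom_b\t8\t30\t0.5\n"
)
presets = load_presets(io.StringIO(tsv_text))
print(presets["custom_a"])
```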

Implementation Details

  1. assets/search_presets.tsv — TSV file with 10 predefined presets
  2. assets/schema_search_presets.json — JSON schema for preset validation (with meta key mappings)
  3. resolveSearchParams() in utils_nfcore_mhcquant_pipeline/main.nf — priority resolution: CLI > preset > default
  4. conf/modules.config — CometAdapter/MS2Rescore/IDFilter use meta.X for search params
  5. mhcquant.nf — passes full search meta into RESCORE via dynamic key subtraction
  6. rescore/main.nf — strips search params on emit to maintain downstream join compatibility
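
Items 5 and 6 describe carrying the search parameters in the sample meta and then stripping them again so that joins keyed on meta still match downstream. The Python sketch below mimics that idea with plain dictionaries; it is not the Nextflow code from `rescore/main.nf`, and the key names (other than `search_preset`), file names, and the `join_on_meta` helper are assumptions.

```python
# Hypothetical set of search-related meta keys added for the search step.
SEARCH_KEYS = {"search_preset", "fragment_mass_tolerance", "peptide_max_length"}


def strip_search_params(meta, search_keys=SEARCH_KEYS):
    """Return a copy of meta without the search-parameter keys."""
    return {key: value for key, value in meta.items() if key not in search_keys}


def meta_key(meta):
    """Build a hashable join key from a meta dictionary."""
    return frozenset(meta.items())


def join_on_meta(left, right):
    """Pair entries from two lists of (meta, payload) whose meta is identical."""
    right_index = {meta_key(meta): payload for meta, payload in right}
    return [
        (meta, payload, right_index[meta_key(meta)])
        for meta, payload in left
        if meta_key(meta) in right_index
    ]


rescored = [
    ({"id": "1", "sample": "HepG2", "search_preset": "qe_class1",
      "fragment_mass_tolerance": 0.01}, "sample1.idXML"),
]
downstream = [({"id": "1", "sample": "HepG2"}, "sample1.featureXML")]

# With the enriched meta the keys differ, so nothing joins.
unstripped = join_on_meta(rescored, downstream)
# After stripping, the meta matches again and the join succeeds.
stripped = join_on_meta(
    [(strip_search_params(meta), payload) for meta, payload in rescored],
    downstream,
)
print(len(unstripped), len(stripped))
```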

Testing

  • ✅ Default test profile passes
  • ✅ Search presets test (test_search_presets) — stable with full peptidoform snapshot
  • ✅ Per-sample parameters correctly applied through CometAdapter, MS2Rescore, IDFilter
  • ✅ Global FDR preset grouping verified

Use Case Example

ID  Sample  Condition  ReplicateFileName    SearchPreset
1   HepG2   ClassI     sample1.mzML         qe_class1
2   HepG2   ClassI     sample2.mzML         qe_class1
3   HepG2   ClassII    sample3.mzML         qe_class2
4   Jurkat  ClassI     sample4.mzML         timstof_class1

This allows combining Class I and Class II samples, or different instruments, in a single run with appropriate parameters for each. A CLI override like --fragment_mass_tolerance 0.05 would override all presets globally.
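
To make the preset-aware grouping concrete, the Python sketch below reads the example samplesheet and groups files by their `SearchPreset` value. Grouping by preset alone is a simplification chosen for illustration; the pipeline's global FDR grouping may use additional keys.

```python
import csv
import io
from collections import defaultdict

# The use-case samplesheet from above, written as tab-separated text.
SAMPLESHEET = (
    "ID\tSample\tCondition\tReplicateFileName\tSearchPreset\n"
    "1\tHepG2\tClassI\tsample1.mzML\tqe_class1\n"
    "2\tHepG2\tClassI\tsample2.mzML\tqe_class1\n"
    "3\tHepG2\tClassII\tsample3.mzML\tqe_class2\n"
    "4\tJurkat\tClassI\tsample4.mzML\ttimstof_class1\n"
)


def group_by_preset(handle):
    """Map each search preset to the files that should be processed together."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        groups[row["SearchPreset"]].append(row["ReplicateFileName"])
    return dict(groups)


groups = group_by_preset(io.StringIO(SAMPLESHEET))
for preset_name, files in groups.items():
    print(preset_name, files)
```

Keeping files searched with different settings in separate groups avoids pooling scores that were produced under different tolerances or peptide length limits.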

Checklist

  • This PR is against the dev branch
  • Pipeline tests pass
  • Schema validation passes
  • Code follows nf-core best practices
  • Nextflow strict syntax compliant

jonasscheid and others added 11 commits January 23, 2026 20:43
…plesheet

# Conflicts:
#	modules.json
#	modules/local/easypqp/convert/main.nf
#	modules/local/easypqp/library/main.nf
#	modules/local/epicore/main.nf
#	modules/local/ms2rescore/main.nf
#	modules/local/openms/featurefinderidentification/main.nf
#	modules/local/openms/idconflictresolver/main.nf
#	modules/local/openms/idmassaccuracy/main.nf
#	modules/local/openms/mapaligneridentification/main.nf
#	modules/local/openms/maprttransformer/main.nf
#	modules/local/openms/mztabexporter/main.nf
#	modules/local/openms/psmfeatureextractor/main.nf
#	modules/local/openms/textexporter/main.nf
#	modules/local/openmsthirdparty/featurelinkerunlabeledkd/main.nf
#	modules/local/openmsthirdparty/percolatoradapter/main.nf
#	modules/local/pyopenms/chromatogramextractor/main.nf
#	modules/local/pyopenms/ionannotator/main.nf
#	modules/local/tdf2mzml/main.nf
#	modules/local/untar/main.nf
#	modules/local/unzip/main.nf
#	modules/nf-core/openms/idmassaccuracy/tests/main.nf.test
#	subworkflows/local/map_alignment/main.nf
#	subworkflows/local/prepare_spectra/main.nf
#	subworkflows/local/process_feature/main.nf
#	subworkflows/local/rescore/main.nf
#	subworkflows/local/speclib/main.nf
- Add 12 optional search parameter columns to schema_input.json (SearchPreset,
  PeptideMinLength, PeptideMaxLength, PrecursorMassRange, PrecursorCharge, etc.)
- Add resolveSearchParams() with priority: individual column > preset > global params
- Add search_presets.config with 8 presets (lumos/qe/timstof/xl x class1/class2)
- Convert Sample/Condition integers to strings during samplesheet parsing
- Add test_search_params profile to nextflow.config
- Apply meta.X ?: params.X pattern for per-sample parameter override in
  CometAdapter, IDFilter, MS2Rescore, IDMassAccuracy, and IonAnnotator
- Add -weights flag to PERCOLATORADAPTER_GLOBAL to avoid filename collisions
- Carry search_preset through ch_rescore_in in mhcquant.nf
- Group by preset in RESCORE for IDMERGER_GLOBAL and backfilter via combine(by:0)
- Fix EASYPQP_LIBRARY multi-include syntax in speclib subworkflow
- Change Channel.empty() to channel.empty() (lowercase)
- Replace implicit 'it' with explicit closure parameter in EPICORE call
@nf-core-bot
Member

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.5.1.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

- Update test_search_params.config to use nf-core/test-datasets samplesheet
- Add tests/search_params.nf.test with snapshot
- Remove local samplesheets folder (now using test-datasets)
…ile directive

- Rename config, test file, and profile from search_params to search_presets
- Add profile directive to nf-test file matching other test patterns
- Use test-datasets samplesheet URL, remove local samplesheet
- Use string concatenation for fasta path matching test.config pattern
…commented lines

- Restore nf-core modules to match upstream/dev
- Remove emit declarations from version topic outputs in local modules
- Remove version emit metadata from local module meta.yml files
- Add peptidoform content check to search_presets nf-test
…napshot

- Format schema_input.json with prettier
- Add search_presets to schema_params lint ignore (nested config map)
- Update ionannotator snapshot for topic channel version deduplication
@jonasscheid marked this pull request as ready for review February 9, 2026 07:50
@jonasscheid
Collaborator Author

@nf-core-bot fix linting

@axelwalter (Contributor) left a comment

I like the new simplification of removing the custom parameters from the sample sheet and keeping just the preset column with an optional override via CLI params. This approach is straightforward, removes confusion about which settings apply, and de-clutters the sample sheets significantly.

@jonasscheid merged commit 951739d into nf-core:dev Feb 27, 2026
30 of 33 checks passed
@jonasscheid deleted the search-params-samplesheet branch February 27, 2026 10:29